Kubernetes Volume Snapshots
The Kubernetes Volume Snapshot feature provides a standard way to preserve the state of a PersistentVolume (PV) at a specific point in time. This facilitates database backups and data replication to development environments.
Overview
Volume Snapshot creates a point-in-time copy of a volume on the storage system. The feature became GA (Generally Available) in Kubernetes v1.20 and is available only when the CSI (Container Storage Interface) driver in use supports snapshots. Azure storage services such as Azure Disk and Azure Files support it through their CSI drivers.
Key Use Cases
- Backup and Restore:
  - Regularly protect database and application data.
  - Restore volumes from a specific snapshot in case of failure or data corruption.
- Data Replication (Cloning):
  - Copy production data to development/test environments for troubleshooting or testing new features.
  - Create replicas for analyzing large datasets.
- Application Migration:
  - Can be used as an intermediate step when moving data from one cluster to another (although moving the snapshot itself may require additional steps or tools).
Architecture and Key Resources
The Volume Snapshot feature consists of the following three main API resources.
1. VolumeSnapshotClass
Similar to StorageClass, this is a cluster-level resource that specifies parameters and drivers for creating snapshots. It is defined by administrators.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-azuredisk-vsc
driver: disk.csi.azure.com
deletionPolicy: Delete
parameters:
  incremental: "true" # Example parameter specific to Azure Disk
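The deletionPolicy field accepts either Delete or Retain. As a minimal sketch, a second class for backups that should outlive their VolumeSnapshot objects might look like the following. The name csi-azuredisk-vsc-retain is a hypothetical choice for illustration, not something the driver defines.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-azuredisk-vsc-retain   # hypothetical name
driver: disk.csi.azure.com
# Retain keeps the snapshot on the storage system even after the
# VolumeSnapshot object is deleted; cleanup then becomes a manual task.
deletionPolicy: Retain
parameters:
  incremental: "true"
```

Keeping two classes lets users pick the policy per snapshot through volumeSnapshotClassName instead of changing a shared class.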
2. VolumeSnapshot
A namespaced resource created by users to request a snapshot. It specifies which PersistentVolumeClaim (PVC) to snapshot and which VolumeSnapshotClass to use; the PVC must be in the same namespace as the VolumeSnapshot.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-snapshot
  namespace: default
spec:
  volumeSnapshotClassName: csi-azuredisk-vsc
  source:
    persistentVolumeClaimName: my-pvc
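Once the request is processed, the status section of the VolumeSnapshot is filled in. The excerpt below is a sketch of what to look for when inspecting the object (for example with kubectl get volumesnapshot my-snapshot -o yaml). The values are illustrative placeholders, not captured output.

```yaml
# Illustrative excerpt of a VolumeSnapshot after creation; values are placeholders.
status:
  boundVolumeSnapshotContentName: snapcontent-<uid>   # the bound VolumeSnapshotContent
  readyToUse: true      # should be true before the snapshot is used as a restore source
  restoreSize: 10Gi     # minimum size a PVC restored from this snapshot must request
```

If readyToUse stays false, the events on the VolumeSnapshot and its VolumeSnapshotContent are usually the first place to check.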
3. VolumeSnapshotContent
A cluster-level resource representing the actual snapshot on the storage system. Similar to the relationship between PV and PVC, VolumeSnapshot and VolumeSnapshotContent are bound together. It is usually dynamically provisioned.
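For the less common static case, where a snapshot already exists on the storage system, the binding can be declared by hand. The sketch below assumes an existing Azure Disk snapshot; the snapshotHandle value is a placeholder to be replaced with the real snapshot identifier, and the object names are hypothetical.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  name: imported-snapshot-content        # hypothetical name
spec:
  driver: disk.csi.azure.com
  deletionPolicy: Retain                 # avoid deleting a snapshot Kubernetes did not create
  source:
    snapshotHandle: <existing-snapshot-id>   # placeholder: ID of the snapshot on the storage system
  volumeSnapshotRef:                     # the VolumeSnapshot this content binds to
    name: imported-snapshot
    namespace: default
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: imported-snapshot                # hypothetical name
  namespace: default
spec:
  source:
    volumeSnapshotContentName: imported-snapshot-content
```

The two objects reference each other, mirroring how a statically provisioned PV and its PVC are bound.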
Implementation Example: Restore from Snapshot
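To create (restore) a new PVC from a snapshot, use the dataSource field in the PVC definition. The PVC must be in the same namespace as the VolumeSnapshot, and the requested storage must be at least as large as the snapshot's restore size.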
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restore-pvc
  namespace: default
spec:
  storageClassName: managed-csi
  dataSource:
    name: my-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
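To verify the restored data, the new PVC can be mounted like any other claim. The Pod below is a minimal sketch; the Pod name, the busybox image, and the /data mount path are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: restore-check                  # hypothetical name
  namespace: default
spec:
  containers:
    - name: shell
      image: busybox:1.36              # assumption: any small image with a shell works
      # List the restored files, then stay alive for interactive inspection.
      command: ["sh", "-c", "ls -la /data && sleep 3600"]
      volumeMounts:
        - name: restored-data
          mountPath: /data
  volumes:
    - name: restored-data
      persistentVolumeClaim:
        claimName: restore-pvc         # the PVC restored from my-snapshot
```

Depending on the StorageClass's volume binding mode, the volume may be provisioned only once a Pod that uses the claim is scheduled.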
Best Practices
- Application Quiescing:
  - Storage-level snapshots may only guarantee "crash consistency". For databases, it is recommended to flush in-memory data to disk and pause (freeze) writes before taking a snapshot to ensure "application consistency".
  - Use hooks or tools (e.g., Velero) to automate pre/post-snapshot processing.
- Appropriate DeletionPolicy:
  - If deletionPolicy in VolumeSnapshotClass is set to Delete, the snapshot on storage is deleted when the VolumeSnapshot object in Kubernetes is deleted.
  - Consider Retain for critical backups, and design operations to prevent accidental deletion.
- Regular Backup Operations:
  - Automate snapshot creation and retention (generation) management with a CronJob or backup tools (Velero, Kasten K10, etc.) rather than relying on manual creation.
- Region/Zone Redundancy:
  - Azure Disk snapshots are stored in the same region as the source disk. For Disaster Recovery (DR), consider using mechanisms to copy snapshots to another region (e.g., Azure Backup).
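The scheduled-snapshot practice can be sketched with a CronJob that creates a VolumeSnapshot every night. This is an illustration under stated assumptions rather than a complete backup solution: the resource names and schedule are hypothetical, the container image is assumed to provide both kubectl and a POSIX shell, and pruning of old snapshots is not handled.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: snapshot-creator               # hypothetical name
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: snapshot-creator
  namespace: default
rules:
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshots"]
    verbs: ["create", "get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: snapshot-creator
  namespace: default
subjects:
  - kind: ServiceAccount
    name: snapshot-creator
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: snapshot-creator
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-snapshot               # hypothetical name
  namespace: default
spec:
  schedule: "0 2 * * *"                # every day at 02:00
  concurrencyPolicy: Forbid            # skip a run if the previous one is still active
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: snapshot-creator
          restartPolicy: OnFailure
          containers:
            - name: create-snapshot
              image: bitnami/kubectl:latest   # assumption: any image with kubectl and sh
              command:
                - /bin/sh
                - -c
                - |
                  # generateName gives each snapshot a unique suffix;
                  # it requires "kubectl create" rather than "kubectl apply".
                  cat <<EOF | kubectl create -f -
                  apiVersion: snapshot.storage.k8s.io/v1
                  kind: VolumeSnapshot
                  metadata:
                    generateName: my-pvc-
                    namespace: default
                  spec:
                    volumeSnapshotClassName: csi-azuredisk-vsc
                    source:
                      persistentVolumeClaimName: my-pvc
                  EOF
```

A dedicated backup tool is usually the better fit once retention rules, application hooks, or cross-region copies are needed; the CronJob form is mainly useful for simple, single-namespace schedules.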
Considerations
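- CSI Driver Requirements: Both the Kubernetes cluster and the storage CSI driver must support the Volume Snapshot feature. On the cluster side, this means the snapshot CRDs and the snapshot controller must be installed; they are not part of core Kubernetes, so check whether your distribution ships them.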
- Capacity and Cost: Snapshots consume storage capacity. Be mindful of costs, especially if not using incremental snapshots or if change frequency is high. Regularly delete unnecessary snapshots.